
14 research outputs found

Vaccination shapes evolutionary trajectories of SARS-CoV-2

Author: Lässig Michael
Meijers Matthijs
Ruchnewitz Denis
Łuksza Marta
Publication date: 19/07/2022

The large-scale evolution of the SARS-CoV-2 virus has been marked by rapid turnover of genetic clades. New variants show intrinsic changes, notably increased transmissibility, as well as antigenic changes that reduce the cross-immunity induced by previous infections or vaccinations. How this functional variation shapes the global evolutionary dynamics has remained unclear. Here we show that selection induced by vaccination impacts on the recent antigenic evolution of SARS-CoV-2; other relevant forces include intrinsic selection and antigenic selection induced by previous infections. We obtain these results from a fitness model with intrinsic and antigenic fitness components. To infer model parameters, we combine time-resolved sequence data, epidemiological records, and cross-neutralisation assays. This model accurately captures the large-scale evolutionary dynamics of SARS-CoV-2 in multiple geographical regions. In particular, it quantifies how recent vaccinations and infections affect the speed of frequency shifts between viral variants. Our results show that timely neutralisation data can be harvested to identify hotspots of antigenic selection and to predict the impact of vaccination on viral evolution.

arXiv.org e-Print Archive

Fierce selection and interference in B-cell repertoire response to chronic HIV-1

Author: Mora Thierry
Nourmohammad Armita
Otwinowski Jakub
Walczak Aleksandra M
Łuksza Marta
Publication date: 26/02/2018

During chronic infection, HIV-1 engages in a rapid coevolutionary arms race with the host's adaptive immune system. While it is clear that HIV exerts strong selection on the adaptive immune system, the characteristics of the somatic evolution that shape the immune response are still unknown. Traditional population genetics methods fail to distinguish chronic immune response from healthy repertoire evolution. Here, we infer the evolutionary modes of B-cell repertoires and identify complex dynamics with a constant production of better B-cell receptor mutants that compete, maintaining large clonal diversity and potentially slowing down adaptation. A substantial fraction of mutations that rise to high frequencies in pathogen-engaging CDRs of B-cell receptors (BCRs) are beneficial, in contrast to many such changes in structurally relevant frameworks that are deleterious and circulate by hitchhiking. We identify a pattern where BCRs in patients who experience larger viral expansions undergo stronger selection with a rapid turnover of beneficial mutations due to clonal interference in their CDR3 regions. Using population genetics modeling, we show that the extinction of these beneficial mutations can be attributed to the rise of competing beneficial alleles and clonal interference. The picture is of a dynamic repertoire, where better clones may be outcompeted by new mutants before they fix.

arXiv.org e-Print Archive

MPG.PuRe

Hal-Diderot

Significance analysis and statistical mechanics: an application to clustering

Author: Johannes Berg
Marta Łuksza
Michael Lässig
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2010

This paper addresses the statistical significance of structures in random data: Given a set of vectors and a measure of mutual similarity, how likely does a subset of these vectors form a cluster with enhanced similarity among its elements? The computation of this cluster p-value for randomly distributed vectors is mapped onto a well-defined problem of statistical mechanics. We solve this problem analytically, establishing a connection between the physics of quenched disorder and multiple testing statistics in clustering and related problems. In an application to gene expression data, we find a remarkable link between the statistical significance of a cluster and the functional relationships between its genes. Comment: to appear in Phys. Rev. Lett.
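
The cluster p-value described in this abstract can be illustrated with a brute-force simulation. The sketch below is not the paper's analytical solution: it assumes a standard Gaussian background, uses the mean pairwise dot product as a stand-in similarity score, and searches all subsets of a fixed size, which is only feasible for very small data sets. All function names and parameters are hypothetical.

```python
import itertools
import random


def cluster_score(vectors):
    """Mean pairwise dot product among the vectors (an assumed similarity score).

    Requires at least two vectors.
    """
    pairs = list(itertools.combinations(vectors, 2))
    return sum(sum(a * b for a, b in zip(u, v)) for u, v in pairs) / len(pairs)


def best_subset_score(vectors, k):
    """Highest cluster score over all size-k subsets (brute force, small inputs only)."""
    return max(cluster_score(subset) for subset in itertools.combinations(vectors, k))


def empirical_cluster_p_value(observed_score, n_vectors, dim, k, trials=500, seed=0):
    """Estimate how often purely random data contains a size-k subset
    scoring at least observed_score."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumed background distribution: independent standard Gaussian components.
        data = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_vectors)]
        if best_subset_score(data, k) >= observed_score:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

Taking the maximum over subsets is what makes this a multiple-testing problem: the more subsets are searched, the more likely it is that some random group scores highly by chance.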

arXiv.org e-Print Archive

Crossref

Kölner UniversitätsPublikationsServer

Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer

Author: Abu-Akeel Mohsen
Addala Venkateswar
Allen Peter J.
Andrews Lesley
Arena Jennifer
Arshi Mehreen
Asghari Ray
Askan Gokce
Attiyeh Marc
Bailey Peter
Balachandran Vinod P.
Ballal Mo
Barbour Andrew P.
Bassi Claudio
Basturk Olca
Beghelli Stefania
Beilin Maria
Bhanot Umesh
Biankin Andrew V.
Brooke-Smith Mark E.
Bruxner Tim
Cary Charles Ian Ormsby
Chan Timothy A.
Chang David K.
Chantrill Lorraine A.
Chen John
Chin Venessa T.
Chou Angela
Christ Angelika
Clouston Andrew D.
Cooper Caroline L.
Corbo Vincenzo
Cosman Peter H.
Das Amitabha
DeMatteo Ronald P.
Drury Ali
Epari Krishna P.
Eshleman James R.
Fawcett Jonathan W.
Fearon Douglas T.
Feeney Kynan
Fletcher David R.
Forest Cindy
Froio Danielle
Gill Anthony J.
Gnjatic Sacha
Goodwin Annabel
Grbovic-Huezo Olivera
Greenbaum Benjamin D.
Grimison Peter
Grimmond Sean M.
Gönen Mithat
Hatzifotis Michael
Herbst Brian
Hermann David
High Hilda A.
Hodgin Mary
Hodgkinson Peter
Hofmann Oliver
Holmes Oliver
Hruban Ralph H.
Humphris Jeremy L.
Iacobuzio-Donahue Christine A.
Ismail Kasim
James Virginia
Jamieson Nigel B.
Johns Amber L.
Jones Marc D.
Kazakoff Stephen
Kelley Z. Larkin
Kench James G.
Kirk Judy
Lam Vincent W.
Lawlor Rita T.
Leach Steven D.
Leonard Conrad
Levine Arnold J.
Loo Jennifer
Makarov Vladimir
Martin Patrick
Martin Sancha
McKay Skye H.
McLeod Duncan
Mead R. Scott
Medina Benjamin
Merad Miriam
Merghoub Taha
Merrett Neil D.
Mittal Anubhav
Moral John Alec
Morgan Ashleigh
Mukhedkar Sanjay
Mukhopadhyay Pamela
Musgrove Elizabeth A.
Nagrial Adnan M.
Newell Felicity
Nguyen Nan Q.
Nikfarjam Mehrdad
Nones Katia
Nourse Craig
O’Connor Chelsie
O’Rourke Thomas J.
Pajic Marina
Papangelis Virginia
Patch Ann-Marie
Pavey Darren
Pavlakis Nick
Pearson John V.
Pinese Mark
Pinho Andreia V.
Remark Romain
Riaz Nadeem
Ruszkiewicz Andrew R.
Saglimbeni Joseph
Samra Jaswinder S.
Sandroussi Charbel
Scardoni Maria
Scarpa Aldo
Senbabaoglu Yasin
Slater Kellee
Smoragiewicz Martin
Spigelman Allan
Steinmann Angela
Stoita Alina
Texler Michael
Timpson Paul
Tucker Katherine
Vennin Claire
Waddell Nicola
Warren Sean
Wells Daniel K.
Williams David
Wilson Peter J.
Wolchok Jedd D.
Wolfgang Christopher L.
Wood Scott
Worthley Chris
Wu Jianmin
Xu Christina
Zappasodi Roberta
Zeps Nikolajs
Zhang Jennifer
Zhao Julia N.
Łuksza Marta
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/11/2017

Pancreatic ductal adenocarcinoma is a lethal cancer with fewer than 7% of patients surviving past 5 years. T-cell immunity has been linked to the exceptional outcome of the few long-term survivors, yet the relevant antigens remain unknown. Here we use genetic, immunohistochemical and transcriptional immunoprofiling, computational biophysics, and functional assays to identify T-cell antigens in long-term survivors of pancreatic cancer. Using whole-exome sequencing and in silico neoantigen prediction, we found that tumours with both the highest neoantigen number and the most abundant CD8+ T-cell infiltrates, but neither alone, stratified patients with the longest survival. Investigating the specific neoantigen qualities promoting T-cell activation in long-term survivors, we discovered that these individuals were enriched in neoantigen qualities defined by a fitness model, and neoantigens in the tumour antigen MUC16 (also known as CA125). A neoantigen quality fitness model conferring greater immunogenicity to neoantigens with differential presentation and homology to infectious disease-derived peptides identified long-term survivors in two independent datasets, whereas a neoantigen quantity model ascribing greater immunogenicity to increasing neoantigen number alone did not. We detected intratumoural and lasting circulating T-cell reactivity to both high-quality and MUC16 neoantigens in long-term survivors of pancreatic cancer, including clones with specificity to both high-quality neoantigens and predicted cross-reactive microbial epitopes, consistent with neoantigen molecular mimicry. Notably, we observed selective loss of high-quality and MUC16 neoantigenic clones on metastatic progression, suggesting neoantigen immunoediting. Our results identify neoantigens with unique qualities as T-cell targets in pancreatic ductal adenocarcinoma. More broadly, we identify neoantigen quality as a biomarker for immunogenic tumours that may guide the application of immunotherapies.

Enlighten

Cluster-Statistik und Genexpressionanalyse

Author: Łuksza Marta
Publication date: 01/01/2012

Clustering, which involves dividing data elements into classes based on their observed properties, is one of the main tools in exploratory data analysis. It is used widely in the analysis of gene expression, where one searches for structures related to the underlying biological mechanisms. Clusters of gene expression patterns are a signature of a common regulatory process of the involved genes. Clusters of experimental conditions, e.g. tissues in an organism, imply similar states of cell differentiation. The latter property is used in the tumour sample classification. This thesis establishes a statistical grounding for cluster analysis in high-dimensional data. The methods used in the thesis are strongly influenced by solutions from the field of statistical mechanics. The basic concepts and computational methods of statistical mechanics are summarised in Chapter 2. In Chapter 3, we propose probabilistic models for vectors in high-dimensional real space. Motivated by the characteristics of gene expression data, we discuss different properties defining a cluster: point density, positional bias, and directional density (defined in Chapter 3). These properties are related to different choices of a similarity measure and of a background distribution for unclustered vectors. We consider several combinations of such background distributions and similarity measures, and we arrive at well-defined scoring schemes for clusters. Clusters in data usually arise due to an underlying functional mechanism. However, even unrelated vectors drawn from the background distribution can form agglomerations which by chance resemble clusters and yield high cluster scores. In Chapter 4, we address the problem of the statistical significance of clusters. For the scoring schemes proposed in Chapter 3, we compute the cluster score p-value, which tells how likely it is to observe a group of random vectors with the same or higher score. 
Our analytical solution is based on a mapping to a problem from the statistical mechanics of disordered systems. In an application to yeast gene expression data, we show that the cluster score p-value is in agreement with the biological significance of clustered genes, as measured by enrichment of considered clusters in gene ontology terms (i.e. known functional annotations of genes). In Chapter 5, we focus on another important aspect of the statistics of high-dimensional data: dependencies between vector components. Such dependencies are prevalent in gene expression data, for example between subsequent time points in time-course experiments. Correct estimation of such dependencies is crucial both for clustering of experimental conditions, and for computation of similarities of gene expression vectors. Here, we show that the estimation of vector-component dependencies requires accounting for an important confounding factor: the presence of clusters of data vectors. We propose a mixture-model-based inference method, which disentangles the spurious effect of clusters from the true signal. We successfully apply our method to the problem of tumour sample classification. In Chapter 6, we propose the significance-based clustering algorithm. The algorithm seeks the best representation of data as a mixture of the background and of clusters characterised by a statistically significant score. In the implementation of this approach, we draw from all concepts discussed in the preceding chapters of this thesis: In the process of finding clusters of vectors, the algorithm estimates the metric which accounts for dependencies between components of the vectors. Further, using the probabilistic framework of the mixture-model, it assigns low prior probability, and effectively penalises, clusters with high cluster score p-value. 
In an application to yeast and human gene expression data, we show that the significance constraint improves the biological significance of the resulting clusters.

Institutional Repository of the Freie Universität Berlin

Can we read the future from a tree?

Author: Michael Lässig
Marta Łuksza
Publication venue: eLife Sciences Organisation, Ltd.
Publication date: 01/01/2014


Crossref

MIT Libraries Dome

Two-Stage Model-Based Clustering for Liquid Chromatography Mass Spectrometry Data Analysis

Author: Gambin Anna
Karczmarski Jakub
Kluge Bogusław
Ostrowski Jerzy
Łuksza Marta

Proteomic mass spectrometry is gaining an increasing role in diagnostics and in studies on protein complexes and biological systems. This experimental technology is producing high-throughput data which is inherently noisy and may contain various errors. Mathematical processing can help in removing them.

Research Papers in Economics

Predictive Modeling of Influenza Shows the Promise of Applied Evolutionary Biology

Author: Bedford Trevor
Gostic Katelyn M.
Grenfell Bryan T.
Lässig Michael
McCauley John W.
Morris Dylan H.
Neher Richard A.
Pompei Simone
Łuksza Marta
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Seasonal influenza is controlled through vaccination campaigns. Evolution of influenza virus antigens means that vaccines must be updated to match novel strains, and vaccine effectiveness depends on the ability of scientists to predict nearly a year in advance which influenza variants will dominate in upcoming seasons. In this review, we highlight a promising new surveillance tool: predictive models. Developed through data-sharing and close collaboration between the World Health Organization and academic scientists, these models use surveillance data to make quantitative predictions regarding influenza evolution. Predictive models demonstrate the potential of applied evolutionary biology to improve public health and disease control. We review the state of influenza predictive modeling and discuss next steps and recommendations to ensure that these models deliver upon their considerable biomedical promise.

Crossref

Kölner UniversitätsPublikationsServer

edoc

Fierce Selection and Interference in B-Cell Repertoire Response to Chronic HIV-1

Author: Aleksandra M Walczak
Armita Nourmohammad
Jakub Otwinowski
Marta Łuksza
Thierry Mora
Publication venue: 'Oxford University Press (OUP)'

Crossref

Recommended from our members

A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy

Author: Balachandran Vinod P.
Chan Timothy A.
Greenbaum Benjamin D.
Hellmann Matthew D.
Levine Arnold J.
Makarov Vladimir
Merghoub Taha
Riaz Nadeem
Rizvi Naiyer A.
Solovyov Alexander
Wolchok Jedd D.
Łuksza Marta
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/11/2017

Checkpoint blockade immunotherapies enable the host immune system to recognize and destroy tumour cells. Their clinical activity has been correlated with activated T-cell recognition of neoantigens, which are tumour-specific, mutated peptides presented on the surface of cancer cells. Here we present a fitness model for tumours based on immune interactions of neoantigens that predicts response to immunotherapy. Two main factors determine neoantigen fitness: the likelihood of neoantigen presentation by the major histocompatibility complex (MHC) and subsequent recognition by T cells. We estimate these components using the relative MHC binding affinity of each neoantigen to its wild type and a nonlinear dependence on sequence similarity of neoantigens to known antigens. To describe the evolution of a heterogeneous tumour, we evaluate its fitness as a weighted effect of dominant neoantigens in the subclones of the tumour. Our model predicts survival in anti-CTLA-4-treated patients with melanoma and anti-PD-1-treated patients with lung cancer. Importantly, low-fitness neoantigens identified by our method may be leveraged for developing novel immunotherapies. By using an immune fitness model to study immunotherapy, we reveal broad similarities between the evolution of tumours and rapidly evolving pathogens.

Princeton University Open Access Repository

Crossref